AN EMPIRICAL BAYESIAN ANALYSIS OF SIMULTANEOUS CHANGEPOINTS IN MULTIPLE DATA SEQUENCES

Authors

  • Zhou Fan
  • Lester Mackey

Abstract

Motivated by applications in genomics, finance, and biomolecular simulation, we introduce a Bayesian framework for modeling changepoints that tend to co-occur across multiple related data sequences. We infer the locations and sequence memberships of changepoints in our hierarchical model by developing efficient Markov chain Monte Carlo sampling and posterior mode finding algorithms based on dynamic programming recursions. We further propose an empirical Bayesian Monte Carlo expectation-maximization procedure for estimating unknown prior parameters from data. The resulting framework accommodates a broad range of data and changepoint types, including real-valued sequences with changing mean or variance and sequences of counts or binary observations. We demonstrate on simulated data that our changepoint estimation accuracy is competitive with the best methods in the literature, and we apply our methodology to the discovery of DNA copy number variations in cancer cell lines and the analysis of historical price volatility in U.S. stocks.
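
To make the idea of changepoints that co-occur across sequences concrete, the following is a minimal simulation sketch. It is not the authors' hierarchical model or sampler: the function name `simulate_shared_changepoints`, the parameters `change_prob` and `member_prob`, and the Gaussian mean-shift setup are all illustrative assumptions.

```python
import random


def simulate_shared_changepoints(num_sequences, length, change_prob,
                                 member_prob, seed=0):
    """Simulate sequences whose mean shifts tend to occur at shared positions.

    Illustrative assumptions (not taken from the paper): every position
    t >= 1 is a changepoint with probability change_prob; at a changepoint,
    each sequence independently takes part with probability member_prob and
    draws a fresh mean. Observations are Gaussian with unit variance.
    """
    rng = random.Random(seed)
    means = [rng.gauss(0.0, 2.0) for _ in range(num_sequences)]
    data = [[] for _ in range(num_sequences)]
    changepoints = {}  # position -> indices of the sequences that changed there
    for t in range(length):
        if t > 0 and rng.random() < change_prob:
            members = [i for i in range(num_sequences)
                       if rng.random() < member_prob]
            if members:  # record the position only if some sequence changed
                changepoints[t] = members
                for i in members:
                    means[i] = rng.gauss(0.0, 2.0)
        for i in range(num_sequences):
            data[i].append(rng.gauss(means[i], 1.0))
    return data, changepoints
```

Calling `simulate_shared_changepoints(5, 200, 0.02, 0.6, seed=1)` returns five sequences of length 200 together with a dictionary mapping each changepoint position to the sequences affected there. Inferring those two quantities from the data alone is the task the abstract describes.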

Similar resources

Finite-State Markov Chains for Multiple Sequences

We consider the analysis of sets of categorical sequences consisting of piecewise homogeneous Markov segments. The sequences are assumed to be governed by a common underlying process with segments occurring in the same order for each sequence. Segments are defined by a set of unobserved changepoints where the positions and number of changepoints can vary from sequence to sequence. We propose a ...

Objective Bayesian Analysis of Multiple Changepoints for Linear Models

This paper deals with the detection of multiple changepoints for independent but non-identically distributed observations, which are assumed to be modeled by a linear regression with normal errors. The problem has a natural formulation as a model selection problem and the main difficulty for computing model posterior probabilities is that neither the reference priors nor any form of empirical B...

Joint Bayesian Stochastic Inversion of Well Logs and Seismic Data for Volumetric Uncertainty Analysis

Herein, an application of a new seismic inversion algorithm in one of Iran's oilfields is described. Stochastic (geostatistical) seismic inversion, as a complementary method to deterministic inversion, is regarded as a combination of geostatistics and seismic inversion algorithms. This method integrates information from different data sources with different scales, as prior informat...

Analysis of mitochondrial DNA sequences of Turcinoemacheilus genus (Nemacheilidae Cypriniformes) in Iran

Members of the genus Turcinoemacheilus (family Nemacheilidae) were subjected to molecular phylogenetic analysis in this study. In 2009 this genus was reported to inhabit the Karoon River drainage, contrary to the previous assumption that it was endemic to the Tigris River basin. It was sampled from three stations located in different tributaries of the Karoon drainage and evaluated to unders...

Detecting simultaneous changepoints in multiple sequences.

We discuss the detection of local signals that occur at the same location in multiple one-dimensional noisy sequences, with particular attention to relatively weak signals that may occur in only a fraction of the sequences. We propose simple scan and segmentation algorithms based on the sum of the chi-squared statistics for each individual sample, which is equivalent to the generalized likeliho...

Publication date: 2015